Deep Model Transferability from Attribution Maps
Exploring the transferability between heterogeneous tasks sheds light on their intrinsic interconnections, and consequently enables knowledge transfer from one task to another so as to reduce the training effort of the latter. In this paper, we propose an embarrassingly simple yet very efficacious approach to estimating the transferability of deep networks, especially those handling vision tasks. Unlike the seminal work of \emph{taskonomy}, which relies on a large number of annotations as supervision and is thus computationally cumbersome, the proposed approach requires no human annotations and imposes no constraints on the architectures of the networks. Specifically, this is achieved by projecting deep networks into a \emph{model space}, wherein each network is treated as a point and the distance between two points is measured by the deviation of their produced attribution maps. The proposed approach is several orders of magnitude faster than taskonomy, and meanwhile preserves a task-wise topological structure highly similar to the one obtained by taskonomy.
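The model-space idea above can be illustrated with a minimal, hypothetical sketch. The paper's actual pipeline uses attribution maps of deep vision networks; here, purely for illustration, linear maps stand in for networks, the attribution is gradient × input (a common attribution method, assumed for this sketch), and the model distance is the mean cosine distance of attribution maps over a shared set of unlabeled probe inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_linear_model(d):
    # A linear "model" f(x) = W . x standing in for a deep network.
    # Its gradient with respect to the input x is simply W.
    return rng.normal(size=(d,))

def attribution_map(W, x):
    # Gradient * input attribution; for a linear model the gradient is W.
    return W * x

def model_distance(W_a, W_b, probes):
    # Mean cosine distance between the two models' attribution maps,
    # computed over the same unlabeled probe inputs (no annotations needed).
    dists = []
    for x in probes:
        a = attribution_map(W_a, x)
        b = attribution_map(W_b, x)
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
        dists.append(1.0 - cos)
    return float(np.mean(dists))

d = 16
probes = rng.normal(size=(32, d))   # shared probe set
W1 = make_linear_model(d)
W2 = make_linear_model(d)

# A model's distance to itself is ~0; distinct models lie farther apart.
print(model_distance(W1, W1, probes))
print(model_distance(W1, W2, probes))
```

Pairwise distances computed this way populate a similarity matrix over tasks, which is the object compared against taskonomy's affinity matrix in the paper.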
Reviews: Deep Model Transferability from Attribution Maps
Taskonomy's transferabilities have practical value (they are constructed, and shown, to reduce the need for supervision through transfer learning), but Taskonomy's method is computationally expensive. The gold standard, then, is to reproduce Taskonomy's affinity matrix at lower cost. I therefore see the comparison between the attribution-map transferability matrix and Taskonomy's (fig 4) as valid, and as the main point. But I don't understand why or how the comparison between the SVCCA and attribution-map similarity matrices (figure 3) is useful. What exactly is the value of the SVCCA-based similarity matrix? Why doesn't figure 3 compare the attribution-map matrix against Taskonomy's affinity matrix (after being made symmetric)?